Predict new groups docs #734

GStechschulte · 2023-10-06T13:52:19Z

This PR adds docs for the new sample_new_groups argument of model.predict(), which was merged in PR #693. The notebook explains the motivation for the new argument (it relates to hierarchical models) and shows how to use it to predict for new groups, either directly with model.predict() or with bmb.interpret.comparisons.

GStechschulte · 2023-10-07T07:41:21Z

The failing tests are related to PyMC issue #6941.

review-notebook-app · 2023-10-10T00:51:19Z

tomicapretto commented on 2023-10-10T00:51:19Z
----------------------------------------------------------------

Thanks for the very clear explanation in the second paragraph :)

GStechschulte commented on 2023-10-10T04:55:52Z
----------------------------------------------------------------

Thank you! :)

review-notebook-app · 2023-10-10T00:51:20Z

tomicapretto commented on 2023-10-10T00:51:19Z
----------------------------------------------------------------

I'm not familiar with LabelEncoder. Why do we use it here? If it just makes something categorical, you can always pass categorical=["patient", "smoking_status"] when you create the model instance. I would like to avoid it so that the example does not depend on sklearn.

Also, what is the reason to scale weeks and fvc? Is it to make things easier for the sampler?

GStechschulte commented on 2023-10-10T04:02:48Z
----------------------------------------------------------------

The patient IDs are long, e.g., ID00007637202177411956430, and I wanted them to be patient 1, 2, 3, etc. so it is "easier" to create a new patient. But yeah, we should avoid the dependence. I can create a label encoder function using numpy.

Yes, since weeks and fvc aren't on the same scale.

tomicapretto commented on 2023-10-10T11:09:30Z
----------------------------------------------------------------

Makes sense. I get it makes things easier later when you want to predict for particular individuals. Maybe this can be achieved by doing something like

series.map({value: i for i, value in enumerate(series.unique())}).astype(object)
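The relabelling discussed in this thread can be sketched without sklearn. This is only an illustration, not the code that ended up in the notebook: the data frame, the column values, and every ID other than the one quoted above are hypothetical.

```python
import pandas as pd

# Hypothetical data: one long patient ID from the thread, plus made-up ones,
# repeated across visits.
df = pd.DataFrame(
    {
        "patient": [
            "ID00007637202177411956430",
            "ID_hypothetical_B",
            "ID00007637202177411956430",
            "ID_hypothetical_C",
        ],
        "weeks": [-4, 8, 5, 0],
    }
)

# pd.factorize assigns 0, 1, 2, ... in order of first appearance and
# returns the integer codes together with the original unique labels.
codes, uniques = pd.factorize(df["patient"])

# Store the short labels as a categorical column so the model treats
# them as group labels rather than as numbers.
df["patient"] = pd.Series(codes, index=df.index).astype("category")
```

Keeping `uniques` around makes it possible to map the short labels back to the original IDs later.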

review-notebook-app · 2023-10-10T00:51:21Z

tomicapretto commented on 2023-10-10T00:51:20Z
----------------------------------------------------------------

I would add that we exclude the global intercept so smoking_status uses cell means encoding (i.e. the coefficient represents the mean of the group). And since we don't have a global intercept, we don't include a deflection around it, because it's simply not there. However, we do include a deflection around the weeks slope with weeks | patient.
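To make the cell means point concrete, here is a small numpy/pandas sketch. It does not use Bambi; the smoking_status levels and the response values are made up for illustration. Without an intercept column, each least-squares coefficient equals the mean of its group; with an intercept, the coefficients become differences from a reference level.

```python
import numpy as np
import pandas as pd

# Hypothetical levels and responses, chosen only to illustrate the encoding.
smoking_status = pd.Series(
    ["Ex-smoker", "Never smoked", "Currently smokes", "Ex-smoker"]
)
y = np.array([2.0, 4.0, 6.0, 3.0])

# Cell means encoding: no intercept, one indicator column per level
# (columns are ordered alphabetically by level).
X_cell = pd.get_dummies(smoking_status).to_numpy(dtype=float)
coef_cell, *_ = np.linalg.lstsq(X_cell, y, rcond=None)

# Reference (treatment) encoding: intercept plus indicators for all
# levels except the first one.
dummies = pd.get_dummies(smoking_status, drop_first=True).to_numpy(dtype=float)
X_ref = np.column_stack([np.ones(len(y)), dummies])
coef_ref, *_ = np.linalg.lstsq(X_ref, y, rcond=None)

# Group means for comparison with coef_cell.
group_means = pd.Series(y).groupby(smoking_status.to_numpy()).mean()
```

Here `coef_cell` matches `group_means`, while `coef_ref` holds the reference group's mean followed by offsets from it.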

GStechschulte commented on 2023-10-10T04:53:56Z
----------------------------------------------------------------

I am a bit confused about the "deflection" terminology. When you say "deflection" are you describing "variation"? For example,

However, we do include a deflection around the weeks slope with weeks | patient

Is like saying "the weeks slope is allowed to vary by individual patients"?

Edit: I just looked up deflection regarding statistical modelling:

"deflection" is often used to describe how coefficients (typically regression coefficients) deviate or vary from some reference point. It is a way to express how the effect of a predictor variable varies across different groups or levels of that variable.

tomicapretto commented on 2023-10-10T11:12:26Z
----------------------------------------------------------------

Is like saying "the weeks slope is allowed to vary by individual patients"?

Exactly. The slope for an individual "j" is "b_{week, j} = b_week + u_j". b_week is the common slope, while u_j is the deflection around that common slope for every individual j.
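The decomposition above also shows why predicting for an unseen patient needs extra sampling. The sketch below is a conceptual illustration with numpy, not Bambi's implementation: the "posterior draws" are simulated stand-ins, and it assumes the deflections follow a Normal(0, sigma_u) distribution. An observed patient has posterior draws of u_j; a new patient does not, so one deflection is drawn per posterior draw.

```python
import numpy as np

rng = np.random.default_rng(0)
n_draws = 1000

# Hypothetical stand-ins for posterior draws from a fitted model.
b_week = rng.normal(-0.2, 0.02, size=n_draws)          # common slope
sigma_u = np.abs(rng.normal(0.1, 0.01, size=n_draws))  # sd of the deflections
u_known = rng.normal(0.05, 0.02, size=n_draws)         # deflection of an observed patient j

# Observed patient: slope is the common slope plus that patient's deflection.
slope_known = b_week + u_known

# Unseen patient: no posterior draws of u exist, so draw one deflection
# per posterior draw from the assumed Normal(0, sigma_u).
u_new = rng.normal(0.0, sigma_u)
slope_new = b_week + u_new
```

The slope for the new patient is more uncertain than the common slope because it carries the between-patient variation in addition to the uncertainty in b_week.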

review-notebook-app · 2023-10-10T00:51:23Z

tomicapretto commented on 2023-10-10T00:51:22Z
----------------------------------------------------------------

Do we need to use this very large number of tune steps? I would increase the number of draws because of the autocorrelation. Also, do we need init="auto"?

GStechschulte commented on 2023-10-10T04:09:22Z
----------------------------------------------------------------

Nope, and increasing draws reduces the autocorrelation, and nope :)

review-notebook-app · 2023-10-10T00:51:24Z

tomicapretto commented on 2023-10-10T00:51:23Z
----------------------------------------------------------------

I would say something about the posteriors for weeks and weeks|patient, where we see that the slope can be very different for some individuals.

GStechschulte commented on 2023-10-10T04:20:07Z
----------------------------------------------------------------

Yup, good catch!

tomicapretto · 2023-10-10T00:51:37Z

@GStechschulte it's already in very good shape, just some minor comments.

GStechschulte · 2023-10-10T04:58:11Z

@GStechschulte it's already in very good shape, just some minor comments.

Thanks for the kind words and review! Much appreciated!

tomicapretto · 2023-10-10T11:09:31Z

Edit (on the label encoding suggestion above): just saw that you already modified it.

tomicapretto · 2023-10-10T11:15:50Z

@GStechschulte looks perfect, thanks!

GStechschulte added 3 commits October 6, 2023 15:44

explicitly convert to categorical dtype instead of casting

8116f15

predict new groups docs

776ad2d

run black

32fceb2

change to non-centered model, update plots, and add. wording

729ceca

GStechschulte requested a review from tomicapretto October 7, 2023 06:28

Tomas's code reivew

415aef9

tomicapretto merged commit 3aaebca into bambinos:main Oct 10, 2023
1 of 4 checks passed

GStechschulte deleted the predict-groups-examples branch January 21, 2024 20:19
